Introduction

This guide has been written for you to follow along, you will create your own version of this file as you go. It will work best if you have two screens - have this document open on one screen, and R Studio with the code and output on the other (main) screen.

Before we go any further, copy the code below into your console and run it. This will identify and install any missing packages that are required for this script to run correctly.

## If a package is installed, it will be loaded. If any 
## are not, the missing package(s) will be installed from CRAN

## First specify the packages of interest
packages = c("dplyr", "tidyr", "lubridate", "zoo",
             "ggplot2", "DT", "reactable", "plotly")

## Now installs if required
package.check <- lapply(
  packages,
  FUN = function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      }
  }
)

Start by opening the Rmd file “developing script.Rmd” in R Studio. This is an unformatted version of the markdown file that created this guide. You run a markdown by ‘knit’ing it (look for the icon - it’s a ball of wool with a knitting needle in it). Click the ’knit’ button and look at the document it creates. How does it compare to this file? (The first time you knit an Rmd script you may receive warning messages about packages that are required but aren’t yet installed. We have tried to pre-empt this by running the code above, but if it still happens install those packages and knit again.)

As you work through this guide you will add formatting and content to your script until your knitted document looks like this one!

This guide does not go into the magic that happens between hitting the ‘knit’ button and the output file being produced. There are plenty of resources on the internet if you want to explore this further. For more details about all things markdown visit http://rmarkdown.rstudio.com.

This guide will start by looking at formatting your document, then will add in some data and charts, and improve how they look. The final section adds some interactivity to some of the charts.

What is R Markdown?

Markdown is a way to create and format documents, the types of which include (but aren’t limited to) html, pdf, Word, dashboards, slides etc. Two major benefits of creating these using R Markdown are reproducibility and automation. Code can be embedded, so the creation of properly formatted text plus charts and visualisations all happen with one click. This training guide will focus exclusively on producing html output, but hopefully by the end of it you will have the confidence and skills to investigate other output formats.

Formatting

You will no doubt be familiar with headers.

They can get smaller

and smaller

and even smaller still

R markdown uses the # symbol to create headers. A level one header uses one #, use ## for a level two header etc. Note how, in the script, the background colour of the row changes.

Exercise 1: Turn ‘Introduction’, ‘What is R Markdown?’ and ‘Formatting’ into level 1 headers. Apply subsequent header levels to the text that follows ‘Formatting’.

Italics, bold and super/sub scripts

The asterisk is used to make text bold or italic (or both). For italics, apply one * either side of the word/phrase, for bold use ** either side of the word/phrase. If you want both italics and bold, apply ***.
In the same way, apply ^ for superscript and ~~ for strike-through (note the 2 ~~ symbols).

Exercise 2:

Make this text italic

make this text bold

Really emphasise this by making it bold and italics

Apply a superscript to part of this text

Strike through this text

Lists

Just as in Word, you can create bulleted and numbered lists. For bullet points, use *, - or + at the start of each point to be bulleted. Indent and use + for sub-bullets.

Exercise 3a: turn the first list into a bulleted list with sub-bullet, and the second list into a numbered list.

  • Asterisk, dash or + sign for bullet points
  • and other lists
    • indent for sub-lists
  • and more points if that’s what you want

Exercise 3b: Create numbered lists by starting each line with (@). The numbers will dynamically adjust.

  1. markdown does the numbering for you
  2. so there’s no chance of mis-counting after an edit
  3. it will automatically update
  4. isn’t that clever!

You can use numbers to make a numbered list, but you need to add a full-stop after each number.

  1. you can choose to number lists yourself
  2. it must start at number 1,
  3. but after that it will number consecutively even if you wanted this item to be numbered e.g. 5

Paragraphs & Spacing

To create a line break you have to double space after the last sentence then press return.

Even if you press enter in the markdown script, it won’t appear as a new paragraph in the document unless you end the sentence with a double space. This line looks like it should be a new paragraph in the markdown code, but it isn’t once it renders to html.

Make text stand out by using block quotes
Preceed each line with >
This can be useful for headline statements
But remember to put two spaces after each sentence

Exercise 4: turn the previous four lines into block code

Use 3 asterisks to create a line break below this sentence.


The asterisk is turning out to be quite a versatile symbol!

Appearance

Your document should be looking more like this one, but it’s still missing the table of contents. We add that in the YAML header. While we are there, we will change the theme. Choose from any of the Bootswatch free themes, or download other templates.

Exercise 6: amend your YAML header as below. Note the indentation after ‘output’, and further indentation after ‘html_document’ - in YAML, this is essential so it knows to link it all back to the html document output. See if you like any of the other free themes from Bootswatch (cerulean, cosmo, cyborg, darkly, flatly, journal, lumen, paper, readable, sandstone, simplex, slate, spacelab, superhero, united or yeti).

output: 
  html_document:
    theme: cosmo
    toc: true
    toc_float: yes

While you are in the YAML header, change the author and date too.

Dealing with data

Now things start to get really interesting! We are going to import some data, do some wrangling and create some outputs to display. We will be using R to do all this, but R Markdown supports other languages too.

There are two ways of inserting code into a document: blocks of code are wrapped up in a code chunk, but you can also insert objects inline in text. You create a code chunk using the Insert option at the top of the pane, the keyboard shortcut Ctrl + Alt + I or manually, by wrapping it in three backticks at the start and end of the chunk. You also need to use curly brackets to declare which language the code is in.

```{r add-chunk-name}

# insert R code here

```

It’s good practice to name your code chunks (each chunk needs a unique name), keep it short but descriptive of what it is doing. Don’t use spaces, dots or underscores in chunks - use “-” if you need to use a separator.

Adding chunk options within the curly brackets can e.g. hide the code, only show the results, evaluate the code or not (plus many more options). If you want the same rules to apply to all code chunks you would include it in the global options. Look at the very first code chunk in the script - include = FALSE excludes the code chunks from the output by default.

Use single backticks to run code inline with text. You still need to include r but this time it doesn’t need to be in curly brackets. For example, to print today’s date use `r format(Sys.time(), '%d %B %Y')` inline with the rest of your text.

Exercise 7:

  1. If you don’t already have the package installed, run remotes::install_github("Public-Health-Scotland/phsopendata", upgrade = "never") in the console. This will install the PHS Open Data package, which we will use to import some data.

  2. Create a code chunk, name it and paste the following code into the chunk.

## libraries
library(dplyr)
library(tidyr)
library(phsopendata)    # easily access open data from PHS
library(lubridate)      # parse dates correctly
library(zoo)

## one way to get data from the PHS opendata repository (https://www.opendata.nhs.scot/) is by the resource id, found in the metadata section

# importing the last 12 months of monthly A&E data
monthly_ae <- get_resource(res_id = "2a4adc0a-e8e3-4605-9ade-61e13a85b3b9") %>% 
  mutate(month_end = ym(Month)) %>% 
  filter(between(month_end, max(month_end)- months(12), max(month_end))) %>% 
  select(TreatmentLocation, DepartmentType, month_end, NumberOfAttendancesAggregate, NumberMeetingTargetAggregate)

# grouping for use later on
monthly_ae_gp <- monthly_ae %>%
  group_by(DepartmentType, month_end) %>% 
  summarise("Number Of Attendances" = sum(NumberOfAttendancesAggregate), 
            "Number Meeting Target" = sum(NumberMeetingTargetAggregate)) %>% 
  pivot_longer(
    cols = starts_with("Number"),
    names_to = "type",
    values_to = "count"
  ) %>% 
  ungroup()

When you knit, it doesn’t look like anything has happened. But behind the scenes, the data has been imported and filtered, and the object monthly_ae is waiting to be used.

In the next code chunk we will create some simple plots, and also introduce tabsets. These are a useful way to include lots of content without the document becoming excessively long.

Monthly Attendances at A&E

To create tabsets add {.tabset} to the header row. Then create two (or more, how many tabs do you want?) headers at the next level down, with a code chunk in each section.

Attendances at EDs

You can add commentary before the next section heading.

Attendances at MIUs

And different commentary for this chart.

You aren’t limited to just displaying charts in tabsets - you might want to also include the data behind a chart, in a table.

Exercise 8:

  1. add {.tabset} to the Monthly Attendances at A&E level 2 header.
  2. create a level 3 header, Attendances at EDs
  3. paste the following code into a code chunk. Add some suitable commentary below the code chunk
library(ggplot2)

pl_colours <- c("darkred", "steelblue")

plot1 <-ggplot(monthly_ae_gp %>% filter(DepartmentType == "Emergency Department"), 
               aes(x = month_end, y = count, group = type, colour = type)) +
  geom_line() +
 expand_limits(y = 0) +
 labs(title = "Number of Attendances and 4 hour target at Emergency Departments",
      x = "month ending",
      y = "number of Attendances") +
  scale_colour_manual(values = pl_colours) +
 theme_minimal()

 plot1
  1. repeat with another level 3 header, titled Attendances at MIUs
plot2 <-ggplot(monthly_ae_gp %>% filter(DepartmentType == "Minor Injury Unit or Other"), 
               aes(x = month_end, y = count, group = type, colour = type)) +
  geom_line() +
 expand_limits(y = 0) +
 labs(title = "Number of Attendances and 4 hour target at Minor Injury Units",
      x = "month ending",
      y = "number of Attendances") +
  scale_colour_manual(values = pl_colours) +
 theme_minimal()

plot2
  1. If you want to make it look a bit snazzy, add a bit of flair to the tabsets by adding .tabset-fade and/or .tabset-pills to {.tabset} (keep it all inside the curly brackets).

Housekeeping

By now the script is getting quite long, and can be tricky to navigate. If you’ve been good at defining sections then you can toggle to show the document outline on the right-hand side of the script pane. Other points of good practice:

  • load all the libraries you need in the first code chunk. (I haven’t done that in this document, as I wanted you to see which package is used in each example)

  • similarly, load and wrangle the data early on in the script too, before you start creating content. Then the code chunks within the main text can be kept sparse, such as just calling a plot.

  • run each chunk in RStudio as you create it to check it works - it’s easier to troubleshoot this way, than following an error message created when trying to knit.

  • if there’s a lot of processing to be done then create this in another R script instead, and ‘call’ it into the markdown by using source.

  • be very careful using filepaths - the ‘start point’ of filepaths can be different when running a chunk locally than when knitting. The easiest way to get round that is to use the package here, which always starts at the location of the project directory.

Adding some interactivity

It’s all well and good being able to print a chart or table, but it’s still a static object. Wouldn’t it be great if readers could interact with those objects? The beauty of rendering the markdown script to html is that we can! What follows are just a few examples of packages that can add some interactivity to your reports. There aren’t any exercises, but follow along with the script to see how to apply them. In your script the following code chunks are currently set to not run when you knit - delete the code eval = FALSE from each code chunk (or change it to TRUE) so that the code does run from now on.

DT (Datatable)

This default output for displaying tables using DT is shown below, but you can customise it (add options for exporting the data, change how many rows are displayed, plus lots more). However, you can see it doesn’t like the yearmon date format, and displays it as a numeric. (There is probably a workaround…)

Reactable

With reactable, you can group data without having to hard code it first, using the built-in aggregate functions or define your own. Clicking on a grouped item will unroll it.

Plotly

We can apply all sorts of tricks to make our ggplots look fancy, but at the end of the day they are still just static plots. Plotly can bring them to life. Hovering over the lines will display data points, and you can zoom into a specific area, download the plot, plus other options available through the buttons top-right of the plot..

Plotly may not be able to render more advanced ggplots using the ggplotly function, but you can build plots from scratch in plotly. The language and terminology is different than ggplot, but ultimately it gives you a lot more functionality.

What Next?

RStudio Visual Editor

Newer versions of RStudio (v1.4 onwards) have a visual editor, which means you can focus more on the content of your document and less on how to code it. It won’t work with this script so let’s see what happens with a fresh R Markdown script.

Go to File > New File > R Markdown… In the pop-up box give the file a name, add an author and keep the output as HTML. Have a look at the script that is generated. You should recognise the different aspects of a markdown script: YAML, text, code chunks.

Now look at the top left corner, at the toolbar below the script tabs. This is where you can switch between the Source and Visual editors.

In Visual Editor mode you can use the editor toolbar to format text, add images, tables, links, code chunks, citations, footnotes etc. More details can be found in RStudio’s github page. The bookdown package adds extra functionality such as cross-referencing.

Taking it further

This is only a brief introduction to R Markdown, there is so much more to explore! The suggestions below are outwith the scope of this guide, and are only a tiny fraction of what is possible.

  • Automate the same report for multiple groups (e.g. ICSs, LAs, hospitals, …). These are done by defining the parameters. Chapter 15 in R Markdown: the definitive guide describes this in more detail. This method can also be applied to other types of outputs, such as pdf, Word or Powerpoint documents.

  • Flexdashboard is an extension package to markdown that allows you to build dashboards in html. These are useful to display data visualisations, and include other components such as data tables, interactive maps, gauges, value boxes etc. It can be made up of multiple pages, or you can control how analysis and its conclusions are unveiled through the use of a storyboard.

  • Further customise the appearance of the html document by adding CSS styles and html formatting. For instance (look at the source code to see how this effect is created):

you can change the background colour of a block to further highlight key statements

  • and change the font colour for specific words/sentences

  • making your conclusions really stand out